Abstract

Siglecs, the major homologous subfamily of I-type lectins, are attractive therapeutic targets. The primary roles of Siglecs may lie in the recognition and phagocytosis of bacterial pathogens that express sialic acids, maintenance of myelin organization, inhibition of neurite outgrowth, and cell-cell interactions between neurons and glial cells. Siglec-2, a member of the Siglec family, is expressed on the surface of maturing B cells and B cell lymphomas and regulates signal transduction. In this work, the 3-D structure of human Siglec-2 was predicted using molecular modeling techniques. The structure of the complex of Siglec-2 with its ligand, 6’-Sialyl-N-acetyllactosamine (6’-SialylLacNAc), in solution was predicted using a novel docking technique. The structural analysis of the complex and the calculation of a theoretical dissociation constant will help to ascertain the functional roles of such sugar binding proteins.
Introduction

The Siglecs are a specialized subgroup of the Ig superfamily that can recognize sialylated glycoconjugates (Crocker et al., 1998). Sialic acid (Neu5Ac) is an acidic, nine-carbon monosaccharide occurring in the glycocalyx on the cell surface. Multi-cellular organisms use sialic acid conjugates for non-specific electrostatic repulsion between cell types and to mediate cell adhesion, protein-protein interactions, and protein trafficking via sialic acid recognizing receptors (Blixt et al., 2003; Kelm et al., 1994; Powell and Varki, 1994; Brinkman-Van der Linden and Varki, 2000; Cornish et al., 1998; Nicoll et al., 1999; Varki, 1997; Karlsson, 1998; Crocker et al., 1998; Crocker and Varki, 2001). Siglecs are type 1 membrane proteins that recognize sialylated glycoconjugates through an N-terminal sialic acid-binding V-set Ig domain, which is followed by a variable number of C2-set Ig-like domains, a transmembrane domain, and a cytoplasmic tail (Angata et al., 2001).

Siglecs can be divided into two subgroups. Sialoadhesin (Siglec-1), Siglec-2 (CD22), MAG (Siglec-4) and Siglec-15 constitute one subgroup; they share ~25-30% sequence identity in the extracellular region and have divergent cytoplasmic tails. The second subgroup consists of the CD33-related Siglecs, which share 50-80% sequence similarity and have two highly conserved tyrosine-based motifs in their cytoplasmic tails.

Each human Siglec is expressed in a cell type-specific fashion, mainly in the hematopoietic and immune systems, suggesting involvement in discrete functions ranging from regulation of neuronal cell growth and maintenance of myelination in the nervous system (MAG) (Li et al., 1994; Montag et al., 1994) and control of myeloid cell interactions (sialoadhesin (Crocker et al., 1997) and CD33 (Freeman et al., 1995)) to activation of B cells (CD22) (Cyster and Goodnow, 1997).


Figure 1. Glyco-chain structure of the ligand used in this study.





In the present study, I have predicted the 3-D structure of human Siglec-2 (hSiglec-2) along with its specific ligand, 6’-SialylLacNAc (Fig. 1). A structural analysis of the predicted complex was carried out. The theoretical dissociation constant was also calculated for the complex, which helped me to compare the relative binding affinity.
Materials and Methods

The starting scaffold for modeling was the X-ray crystallographic structure of Mus musculus Siglec-1 (mSiglec-1; PDB ID: 1QFP). Because of the low sequence homology with mSiglec-1, the initial structure of hSiglec-2 was obtained using the LOOPP server (Teodorescu et al., 2004; Meller and Elber, 2001; Tobi and Elber, 2000), and the structure was refined using our in-house software package ANALYN and MODELYN (Mandal, 1998). The initial structure of the Siglec-ligand complex was obtained by superposition of the modeled hSiglec-2 structure with the experimental structure of Mus musculus Siglec-1 (PDB ID: 1QFO), followed by repeated energy minimization and dynamics simulations. The structure was refined using the DISCOVER module of InsightII 2005 of Accelrys (San Diego, CA). Structural optimization was done using the cff91 force field and energy minimization (100 steps each of the steepest descent and conjugate gradient methods) followed by dynamics simulations. A typical dynamics run consisted of 100000 steps of one femtosecond after 1000 steps of equilibration, with a conformational sampling of 1 in 100 steps at 300 K. At the end of the dynamics simulation, the conformation with the lowest potential energy was picked for the next cycle of refinement using the ANALYSIS module of InsightII. This combination of dynamics and minimization was repeated until satisfactory conformational parameters were obtained.
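The bookkeeping of one refinement cycle can be sketched in a few lines of Python. This is only an illustration of the protocol parameters quoted above and of the "pick the lowest potential energy conformation" step; it is not InsightII or DISCOVER code. The class name `DynamicsRun`, the function `lowest_energy_frame` and the example energies are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DynamicsRun:
    """Parameters of a typical dynamics run as quoted in the text."""
    production_steps: int = 100_000    # steps of dynamics
    timestep_fs: float = 1.0           # one femtosecond per step
    equilibration_steps: int = 1_000   # steps before sampling starts
    sample_every: int = 100            # 1 conformation kept per 100 steps
    temperature_k: float = 300.0

    def simulated_time_ps(self) -> float:
        """Length of the production run in picoseconds."""
        return self.production_steps * self.timestep_fs / 1000.0

    def n_sampled_frames(self) -> int:
        """Number of conformations written out for later analysis."""
        return self.production_steps // self.sample_every


def lowest_energy_frame(energies):
    """Return (index, energy) of the sampled frame with the lowest potential energy."""
    if not energies:
        raise ValueError("no sampled frames")
    index = min(range(len(energies)), key=energies.__getitem__)
    return index, energies[index]


run = DynamicsRun()
print(run.simulated_time_ps(), "ps sampled in", run.n_sampled_frames(), "frames")

# Hypothetical potential energies (kcal/mol) for three sampled frames.
print(lowest_energy_frame([-10.0, -12.5, -11.0]))
```

Under these parameters each run covers 100 ps and yields 1000 sampled conformations, from which the starting point of the next minimization cycle is chosen.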

In order to investigate the influence of water on ligand binding, water molecules were added as a sphere of radius 18 Å, centered on an atom roughly at the center of the ligand molecule so as to surround it completely, using the Assembly/Soak option of InsightII. In the aqueous environment, structure optimization of the ligand was done using energy minimization and molecular dynamics simulation in the presence and absence of the protein molecule. From the values of the free energies of complex formation of the ligand in water and water-protein environments, the absolute binding energy was calculated using the relation $\Delta G_{bind} = \alpha\,\Delta\langle V^{el}_{l-s}\rangle + \beta\,\Delta\langle V^{vdw}_{l-s}\rangle$, where $\Delta G_{bind}$ is the absolute binding energy and $\Delta$ stands for the differences in the electrical ($V^{el}_{l-s}$) and van der Waals ($V^{vdw}_{l-s}$) components of the free energies of the ligand-solvent (l-s) systems, i.e., in pure water and in protein-containing water environments, following the linear interaction energy approximation method of Åqvist et al. (1994). The weight factors of the electrical and van der Waals contributions were taken as 0.5 ($\alpha$) and 0.16 ($\beta$), respectively, as proposed by Åqvist et al. and used by earlier workers (Åqvist and Mowbray, 1995; Hultén et al., 1997). The dissociation constant $K_d$ was calculated by taking the inverse of the association constant $K_a$. $K_a$ was calculated using the thermodynamic relation $\Delta G_{bind} = -RT \ln K_a$, where R is the ideal gas constant and T is the absolute temperature.
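As a worked illustration of the two relations above, the following Python sketch evaluates the linear interaction energy expression with the weights 0.5 and 0.16 and converts the result to a dissociation constant. The energy components are the values listed in Table 3. The temperature of 300 K is an assumption (it is the dynamics temperature; the text does not state the temperature used for the conversion), and the function names are hypothetical. Because the result depends on the temperature and on rounding, the sketch should be read as giving a sub-millimolar value of the same order as the tabulated one rather than reproducing it digit for digit.

```python
import math

ALPHA = 0.5            # weight of the electrical term (from the text)
BETA = 0.16            # weight of the van der Waals term (from the text)
R_KCAL = 1.987204e-3   # ideal gas constant in kcal/(mol K)


def lie_binding_energy(d_elec, d_vdw, alpha=ALPHA, beta=BETA):
    """Linear interaction energy estimate of the binding energy (kcal/mol)."""
    return alpha * d_elec + beta * d_vdw


def dissociation_constant(dg_bind, temperature_k=300.0):
    """K_d in mol/L from dG_bind = -RT ln K_a and K_d = 1 / K_a."""
    k_a = math.exp(-dg_bind / (R_KCAL * temperature_k))
    return 1.0 / k_a


# Ligand-surroundings energies from Table 3 (kcal/mol).
vdw_bound, elec_bound = -73.45, -208.06   # ligand in water + protein
vdw_free, elec_free = -79.98, -197.28     # ligand in water only

d_vdw = vdw_bound - vdw_free      # van der Waals difference
d_elec = elec_bound - elec_free   # electrical difference

dg = lie_binding_energy(d_elec, d_vdw)
kd = dissociation_constant(dg)    # assumes T = 300 K
print(f"dG_bind = {dg:.2f} kcal/mol, K_d = {kd * 1e3:.3f} mM")
```

A negative binding energy gives $K_a > 1$ and therefore $K_d < 1$ M; the more negative the value, the smaller the dissociation constant.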

MODELYN was run both on an IBM-compatible PC in the Windows environment and on a FUEL workstation of Silicon Graphics, Inc. in the IRIX environment. An Altix 350 server of Silicon Graphics, Inc. in the IRIX environment and the FUEL workstation were used to run InsightII. The electrostatic potential surface of the protein was determined by MOLMOL (Koradi et al., 1996). For checking the structural parameters, PROCHECK (Laskowski et al., 1993) was used. The FUEL workstation in the UNIX operating system was used to run both MOLMOL and PROCHECK. The binding affinity of the Siglec-ligand complex was obtained using the DOCKING module of InsightII. The structure of the ligand was generated using the BUILDER module of InsightII, followed by optimization with repeated energy minimization and molecular dynamics.

Results 
General structural characteristics of the predicted model were determined by measuring all the bond distances and bond angles and calculating the deviation of these parameters from the standard values for the appropriate types of bonds and angles. The quality of the backbone conformation was determined by calculating the phi and psi dihedral angles and drawing Ramachandran plots for the structure. Table 1 presents the RMSD (root mean square deviation) of bond lengths and bond angles of the predicted structure along with the percentages of backbone Phi-Psi angles in the different areas of the Ramachandran plots obtained after the prediction of the 3-D structures.

RMSD values from the respective standard values of around 0.02 Å for bond lengths and around 3 degrees for bond angles indicate good general structural parameters of the modeled structure. The quality of the backbone conformation of the modeled structure is indicated by about 94% of the Phi-Psi pairs lying in the core and allowed areas of the Ramachandran plot, with none in the disallowed area (Table 1).
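The backbone-quality figures can be tallied directly from Table 1. The short Python sketch below simply adds the core and allowed percentages for each structure; the dictionary keys and the function name are hypothetical labels, and the numbers are copied from the table.

```python
# Phi-Psi percentages from Table 1:
# (core, allowed, generously allowed, disallowed)
RAMACHANDRAN = {
    "mSiglec-1 (1QFP)": (77.2, 19.8, 2.0, 1.0),
    "mSiglec-1 (1QFO)": (83.0, 16.0, 0.0, 1.0),
    "hSiglec-2 (model)": (65.7, 28.4, 5.9, 0.0),
}


def core_plus_allowed(row):
    """Percentage of Phi-Psi pairs in the core and allowed areas combined."""
    core, allowed, _generous, _disallowed = row
    return round(core + allowed, 1)


for name, row in RAMACHANDRAN.items():
    print(f"{name}: {core_plus_allowed(row)}% core + allowed, "
          f"{row[3]}% disallowed")
```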

PROCHECK was used to check the side chain planarity of the planar groups in phenylalanine, tyrosine, tryptophan, histidine, arginine, glutamine, asparagine, glutamic acid, and aspartic acid. Deviations from planarity were identified by measuring the RMS (root mean square) distances of the planar atoms from the best-fitted plane; residues having RMS distances >0.03 Å for rings and >0.02 Å for other groups were marked as outliers (Laskowski et al., 1993) (Table 2).
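The planarity criterion can be illustrated with a small pure-Python sketch. It is not PROCHECK's implementation; it only shows one way to obtain the RMS distance of a set of atoms from their least-squares plane, using the fact that the sum of squared distances to the best-fitted plane equals the smallest eigenvalue of the 3x3 scatter matrix of the centered coordinates. The function names and the test coordinates are hypothetical; the cutoffs are the ones quoted above.

```python
import math

RING_CUTOFF = 0.03    # Angstrom, ring groups (from the text)
OTHER_CUTOFF = 0.02   # Angstrom, other planar groups (from the text)


def smallest_eigenvalue_sym3(m):
    """Smallest eigenvalue of a symmetric 3x3 matrix (trigonometric form)."""
    a11, a12, a13 = m[0]
    a22, a23 = m[1][1], m[1][2]
    a33 = m[2][2]
    p1 = a12 * a12 + a13 * a13 + a23 * a23
    if p1 == 0.0:                       # matrix is already diagonal
        return min(a11, a22, a33)
    q = (a11 + a22 + a33) / 3.0
    p2 = (a11 - q) ** 2 + (a22 - q) ** 2 + (a33 - q) ** 2 + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    # B = (A - q*I) / p, then r = det(B) / 2
    b11, b22, b33 = (a11 - q) / p, (a22 - q) / p, (a33 - q) / p
    b12, b13, b23 = a12 / p, a13 / p, a23 / p
    det_b = (b11 * (b22 * b33 - b23 * b23)
             - b12 * (b12 * b33 - b23 * b13)
             + b13 * (b12 * b23 - b22 * b13))
    r = max(-1.0, min(1.0, det_b / 2.0))   # guard against rounding
    phi = math.acos(r) / 3.0
    return q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)


def rms_out_of_plane(coords):
    """RMS distance (same units as coords) of atoms from their best-fit plane."""
    n = len(coords)
    if n < 3:
        raise ValueError("at least three atoms are needed to define a plane")
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    centered = [[c[k] - center[k] for k in range(3)] for c in coords]
    scatter = [[sum(v[i] * v[j] for v in centered) for j in range(3)]
               for i in range(3)]
    lam = max(0.0, smallest_eigenvalue_sym3(scatter))
    return math.sqrt(lam / n)


def is_planarity_outlier(coords, is_ring):
    """Apply the cutoff for rings or for other planar groups."""
    cutoff = RING_CUTOFF if is_ring else OTHER_CUTOFF
    return rms_out_of_plane(coords) > cutoff


# Hypothetical coordinates: four atoms puckered by +/-0.1 Angstrom about z = 0.
puckered = [(1, 1, 0.1), (1, -1, -0.1), (-1, -1, 0.1), (-1, 1, -0.1)]
# Hypothetical coordinates: five atoms lying exactly in the plane z = x.
flat = [(0, 0, 0), (1, 0, 1), (0, 1, 0), (1, 1, 1), (2, 1, 2)]
print(rms_out_of_plane(puckered), is_planarity_outlier(puckered, is_ring=True))
print(rms_out_of_plane(flat), is_planarity_outlier(flat, is_ring=True))
```

The closed-form eigenvalue keeps the sketch free of external libraries; with a numerical library available, a singular value decomposition of the centered coordinates would give the same quantity.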
Table 1. General and backbone structural parameters of the modeled structure of the target sequence as well as the x-ray structures of the Siglecs.

Siglecs   | Accession No | % of AA identity (positive score) | RMS deviation, bond (Å) | RMS deviation, angle (°) | % of Phi-Psi pairs: core | allowed | generously allowed | disallowed
mSiglec-1 | 1QFP         | 100 | 0.018 | 2.33 | 77.2 | 19.8 | 2.0 | 1.0
mSiglec-1 | 1QFO         | 100 | 0.016 | 2.51 | 83.0 | 16.0 | 0.0 | 1.0
hSiglec-2 | AAB06448     | 16  | 0.023 | 2.85 | 65.7 | 28.4 | 5.9 | 0.0

Table 2. General and backbone structural parameters of the modeled structure of the target sequence in comparison with the x-ray structures of the Siglecs.

Siglecs   | Accession No | All-atom clashscore (per 1000 atoms) | Rotamer outliers (%) | Planarity outliers (%)
mSiglec-1 | 1QFP         | 3.16 | 3.42 | 0.0
mSiglec-1 | 1QFO         | 3.26 | 4.81 | 0.0
hSiglec-2 | AAB06448     | 3.26 | 7.62 | 0.0














Table 3. Empirical free energies, their difference in water and water-protein environments and corresponding ΔG_bind and K_d values for the complex formation between hSiglec-2 and the specific ligand in the aqueous solution. Abbreviations used in this table: VdW, van der Waals; Elect, electrical.

Complex                              | Free energy (kcal/mol): VdW | Elect   | Total   | Difference: VdW | Elect  | ΔG_bind (kcal/mol) | K_d
Siglec-2-6’-SialylLacNAc in solution | -73.45                      | -208.06 | -281.51 | +6.53           | -10.78 | -4.35              | 0.724 mM
6’-SialylLacNAc *                    | -79.98                      | -197.28 | -277.26 |                 |        |                    |

* Value corresponding to the interaction energy in presence of water molecules only.


Table 4. Hydrogen-bond network within the binding site of hSiglec-2 in complex with 6’-SialylLacNAc. Distances are measured between hydrogen and acceptor or donor atom.

Ligand-protein hydrogen-bonds

Atoms of 6’-SialylLacNAc (Neu5Ac) | Atoms of hSiglec-2 | Distance (Å)
O1A                               | Lys-105:NZ         | 1.83
O1B                               | Arg-98:NH1         | 2.45
O10                               | Lys-105:N          | 2.48
O4                                | Thr-103:O          | 1.98

Intra-molecular hydrogen-bonds

Atoms of 6’-SialylLacNAc | Atoms of 6’-SialylLacNAc | Distance (Å)
Neu5Ac O8                | Neu5Ac O1B               | 2.19
Nag O4                   | Nag O6                   | 2.10
Nag O7                   | Nag O1                   | 1.78

Protein geometry of the modeled structure was checked by calculating clashscores and rotamer outliers using MOLPROBITY (Davis et al., 2004) (Table 2). Siglec-2 (CD22) is a B cell-specific glycoprotein of the Ig superfamily, highly expressed on the surface of maturing B cells and B cell lymphomas (Haas et al., 2006; Collins et al., 2006). Its extracellular domain contains seven Ig domains, of which the outermost N-terminal domain recognizes sialic acid containing glycan ligands, specifically α(2,6)-linked sialic acid, through which CD22 can induce cell adhesion if the ligand is expressed on target cells (Collins et al., 2006). α(2,6) Sia is a common N-linked terminal carbohydrate which is expressed on several glycoproteins in the serum and also on the surface of several cell types, among them lymphocytes (Ghosh et al., 2006). In addition to regulating signal transduction through its cytoplasmic domain, CD22 regulates B cell development and function through ligand-generated signals (Haas et al., 2006).

I have modeled the structure of hSiglec-2 by applying the threading method of 3-D structure prediction using the LOOPP server, as described in the materials and methods section. Because of the low sequence similarity (only 16% AA identity) with mSiglec-1, homology modeling is not applicable for the 3-D structure prediction of hSiglec-2. As studied earlier (Blixt et al., 2003), the α(2,6) Sia-linked ligand 6’-SialylLacNAc (NeuAcα2,6Galβ1,4GlcNAc) has been docked into the binding site of the modeled structure to study the nature of the interaction in the protein-ligand complex. At first, the ligand molecule was constructed using the BUILDER module of InsightII, followed by optimization with repeated energy minimization and dynamics simulations. After that, the ligand molecule was superposed by matching its sialic acid part to the equivalent part of the ligand in the x-ray structure containing 3’-Sialyllactose (1QFO). Then the modeled protein was superposed with the 3’-Sialyllactose-bound protein (1QFO) with respect to the structurally conserved regions, followed by transfer of the superposed 6’-SialylLacNAc molecule to the binding site.


Figure 2. Mode of ligand binding in hSiglec-2: The ligand binding environment is shown in the secondary structure environment of the modeled lectin. Beta sheets are shown in yellow with an arrow indicating the C-terminus, and random coils as thin cylinders coloured in maroon. The residues of the protein involved in hydrogen bonding with the ligand are shown in stick representation, coloured by atom (C=Green, O=Red and N=Indigo), and the ligand 6’-SialylLacNAc is in red colour.


Optimization of the structure of the resulting complex was done by repeated molecular dynamics and energy minimization in the presence of water, as described in the materials and methods section. The value of ΔG_bind for the complex was calculated by the linear interaction energy approximation, as described in materials and methods, and is presented in Table 3. It may be noted that the calculated ΔG_bind value for the complex of 6’-SialylLacNAc with hSiglec-2 is negative, indicating that the complex formation of 6’-SialylLacNAc with the protein in the aqueous medium is thermodynamically favorable. The value of ΔG_bind for the 6’-SialylLacNAc-protein complex corresponds to a dissociation constant (K_d) of 0.724 mM. The value is comparable with the experimental findings (Blixt et al., 2003). The essential interaction between Arg-97 and the sialic acid carboxylate group is conserved in the structure, as reported in previous studies (Zaccai et al., 2003; May et al., 1998; Bukrinsky et al., 2004). The side chains of Lys-105 and Thr-103 (Fig. 2 & Table 4) are also involved in direct hydrogen bonding with the ligand. The bound conformation of the ligand is stabilized by three intra-molecular hydrogen bonds.


Discussion 

I have modeled the 3-D structure of human Siglec-2. The predicted structure was refined to obtain the best backbone and side chain conformations by executing repeated molecular dynamics and energy minimization and picking the most reliable structure. The structural model does not cover the entire sequence of this lectin, which participates in many crucial phenomena of the mammalian life process; my predictions were limited to the extent of the experimental structures available for proteins homologous to Siglec-2. Nonetheless, the structure encompassed the important segments known to participate in its biological activities.

I have also predicted the structure of the complex of the modeled human Siglec-2 with the specific ligand, 6’-SialylLacNAc, known so far from experimental studies. The nature of the interactions of the ligand with Siglec-2 was examined in detail in order to understand the origin of their specificity at the atomic level. The involvement of the crucial amino acids, identified by experimental techniques, was confirmed from the modeled structure by exploring the involvement of evolutionarily conserved amino acids. The participation of the various loop structures of Siglec-2 in binding to the specific ligand was explored to understand their conformational implications. The chemical environment leading to the stability of the bound ligand was analyzed in atomic detail in the presence of water molecules to simulate closely the aqueous environment. The binding constant was predicted for the modeled complex and compared with the experimental values. Thus, my structural studies using the predicted model of human Siglec-2 and its complex with the specific ligand, 6’-SialylLacNAc, have contributed significantly to understanding the interactions involving sialic acid containing bioactive molecules, which are implicated in many important biochemical phenomena.